<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<!--Converted with LaTeX2HTML 96.1-h (September 30, 1996) by Nikos Drakos (nikos@cbl.leeds.ac.uk), CBLU, University of Leeds -->
<HTML>
<HEAD>
<TITLE>Periodicity artefacts</TITLE>
<META NAME="description" CONTENT="Periodicity artefacts">
<META NAME="keywords" CONTENT="Surrogates">
<META NAME="resource-type" CONTENT="document">
<META NAME="distribution" CONTENT="global">
<LINK REL=STYLESHEET HREF="Surrogates.css">
</HEAD>
<BODY bgcolor=#ffffff LANG="EN" >
 <A NAME="tex2html212" HREF="node15.html"><IMG WIDTH=37 HEIGHT=24 ALIGN=BOTTOM ALT="next" SRC="next_motif.gif"></A> <A NAME="tex2html210" HREF="node9.html"><IMG WIDTH=26 HEIGHT=24 ALIGN=BOTTOM ALT="up" SRC="up_motif.gif"></A> <A NAME="tex2html204" HREF="node13.html"><IMG WIDTH=63 HEIGHT=24 ALIGN=BOTTOM ALT="previous" SRC="previous_motif.gif"></A>   <BR>
<B> Next:</B> <A NAME="tex2html213" HREF="node15.html">Iterative multivariate surrogates</A>
<B>Up:</B> <A NAME="tex2html211" HREF="node9.html">Fourier based surrogates</A>
<B> Previous:</B> <A NAME="tex2html205" HREF="node13.html">Example: Southern oscillation index</A>
<BR> <P>
<H2><A NAME="SECTION00045000000000000000">Periodicity artefacts</A></H2>
<A NAME="secperiod">&#160;</A>
<P>
<P>
The randomisation schemes discussed so far all base the quantification of
linear correlations on the Fourier amplitudes of the data. Unfortunately, this
is not exactly what we want. Remember that the autocorrelation structure given
by
<BR><A NAME="eqautocor">&#160;</A><IMG WIDTH=500 HEIGHT=48 ALIGN=BOTTOM ALT="equation1047" SRC="img55.gif"><BR>
corresponds to the Fourier amplitudes <EM>only</EM> if the time series is one
period of a sequence that repeats itself every <I>N</I> time steps. This is, however,
not what we believe to be the case. Neither is it compatible with the null
hypothesis. Conserving the Fourier amplitudes of the data means that the <EM>
periodic</EM> auto-covariance function
<BR><A NAME="eqcp">&#160;</A><IMG WIDTH=500 HEIGHT=46 ALIGN=BOTTOM ALT="equation1049" SRC="img56.gif"><BR>
is reproduced, rather than <IMG WIDTH=31 HEIGHT=24 ALIGN=MIDDLE ALT="tex2html_wrap_inline2060" SRC="img57.gif">. This seemingly harmless difference can
lead to serious artefacts in the surrogates, and, consequently, spurious
rejections in a test.  In particular, any mismatch between the beginning and
the end of a time series poses problems, as discussed e.g. in
Ref.&nbsp;[<A HREF="node36.html#theiler-sfi">7</A>]. In spectral estimation, problems caused by edge
effects are dealt with by windowing and zero padding. Neither of these techniques
has been successfully implemented for the phase randomisation of surrogates
since both destroy the invertibility of the transform.
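<P>
The distinction between the two auto-covariance functions can be checked numerically.
The following is a minimal Python sketch under stated assumptions, not code from
the original work: since the formulas appear only as images on this page, it assumes
that the ordinary estimate averages over the <I>N</I>-&tau; pairs that actually exist,
while the periodic version wraps the indices around modulo <I>N</I> and averages over
all <I>N</I> products. The function names are hypothetical.

```python
import numpy as np

def periodic_autocovariance(s, tau):
    """Lag-tau autocovariance with indices wrapped around modulo N."""
    s = np.asarray(s, dtype=float)
    return np.mean(s * np.roll(s, tau))

def autocovariance(s, tau):
    """Ordinary estimate: average over the N - tau pairs that actually exist."""
    s = np.asarray(s, dtype=float)
    n = len(s)
    return np.dot(s[tau:], s[:n - tau]) / (n - tau)

rng = np.random.default_rng(0)
s = np.cumsum(rng.standard_normal(256))   # strongly correlated test signal
s = s - s.mean()
n = len(s)

# Inverse transform of the squared Fourier amplitudes, for all lags at once.
power = np.abs(np.fft.fft(s)) ** 2
from_amplitudes = np.fft.ifft(power).real / n

direct_periodic = np.array([periodic_autocovariance(s, t) for t in range(n)])
direct_ordinary = np.array([autocovariance(s, t) for t in range(n)])
```

In this sketch <CODE>from_amplitudes</CODE> agrees with <CODE>direct_periodic</CODE>
up to rounding error, whereas <CODE>direct_ordinary</CODE> in general differs from both
at nonzero lags. The Fourier amplitudes thus fix the periodic quantity only.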
<P>
<blockquote><A NAME="937">&#160;</A><IMG WIDTH=338 HEIGHT=252 ALIGN=BOTTOM ALT="figure1046" SRC="img54.gif"><BR>
<STRONG>Figure:</STRONG> <A NAME="figend">&#160;</A>
   Effect of end point mismatch on Fourier based surrogates. Upper trace:
   1500 iterates of <IMG WIDTH=220 HEIGHT=23 ALIGN=MIDDLE ALT="tex2html_wrap_inline2056" SRC="img53.gif">. Lower trace:
   a surrogate sequence with the same Fourier amplitudes. Observe the
   additional "crinkliness" of the surrogate.<BR>
</blockquote><P>

Let us illustrate the artefact generated by an end point mismatch with an
example. In order to generate an effect that is large enough to be detected
visually, consider 1500 iterates of the almost unstable AR(2) process,
<IMG WIDTH=220 HEIGHT=23 ALIGN=MIDDLE ALT="tex2html_wrap_inline2056" SRC="img53.gif"> (upper trace of Fig.&nbsp;<A HREF="node14.html#figend">6</A>).
The sequence is highly correlated and there is a rather big difference between
the first and the last points. Upon periodic continuation, we see a jump
between <IMG WIDTH=31 HEIGHT=14 ALIGN=MIDDLE ALT="tex2html_wrap_inline2064" SRC="img58.gif"> and <IMG WIDTH=11 HEIGHT=14 ALIGN=MIDDLE ALT="tex2html_wrap_inline2066" SRC="img59.gif">. Such a jump has spectral power at all
frequencies but with delicately tuned phases.  In surrogate time series
conserving the Fourier amplitudes, the phases are randomised and the spectral
content of the jump is spread in time. In the surrogate sequence shown as the
lower trace in Fig.&nbsp;<A HREF="node14.html#figend">6</A>, the additional spectral power is mainly
visible as a high frequency component.  It is quite clear that the difference
between the data and such surrogates will easily be picked up by, say, a
nonlinear predictor, and can lead to spurious rejections of the null
hypothesis.
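<P>
An experiment of this kind can be set up roughly as follows. This is a minimal Python
sketch, not the code used for the figure. The AR(2) coefficients 1.3 and -0.31 are
an assumption (the formula is given only as an image on this page), as are the
Gaussian noise, the length of the discarded transient, and the hypothetical names
<CODE>ar2_series</CODE> and <CODE>phase_randomised_surrogate</CODE>.

```python
import numpy as np

def ar2_series(n, a1, a2, rng, burn_in=1000):
    """Iterate s[i] = a1*s[i-1] + a2*s[i-2] + noise and drop the transient."""
    total = n + burn_in
    noise = rng.standard_normal(total)
    s = np.zeros(total)
    for i in range(2, total):
        s[i] = a1 * s[i - 1] + a2 * s[i - 2] + noise[i]
    return s[burn_in:]

def phase_randomised_surrogate(s, rng):
    """Keep the Fourier amplitudes of s and replace the phases by random ones."""
    s = np.asarray(s, dtype=float)
    n = len(s)
    spectrum = np.fft.rfft(s)
    phases = rng.uniform(0.0, 2.0 * np.pi, len(spectrum))
    randomised = np.abs(spectrum) * np.exp(1j * phases)
    randomised[0] = spectrum[0]          # keep the mean unchanged
    if n % 2 == 0:
        randomised[-1] = spectrum[-1]    # Nyquist term has to stay real
    return np.fft.irfft(randomised, n)

rng = np.random.default_rng(1)
# Assumed coefficients of an almost unstable AR(2) process.
data = ar2_series(1500, 1.3, -0.31, rng)
surrogate = phase_randomised_surrogate(data, rng)

# The jump seen on periodic continuation, in units of the standard deviation.
jump = abs(data[0] - data[-1]) / data.std()

# Mean squared step between neighbours: a crude indicator of "crinkliness".
step_data = np.mean(np.diff(data) ** 2)
step_surrogate = np.mean(np.diff(surrogate) ** 2)
print(jump, step_data, step_surrogate)
```

The surrogate has the same Fourier amplitudes as the data by construction. How much
larger its mean squared step turns out to be depends on the size of the jump in the
particular realisation, so no specific numbers are quoted here.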
<P>
<P><blockquote><A NAME="940">&#160;</A><IMG WIDTH=338 HEIGHT=252 ALIGN=BOTTOM ALT="figure1051" SRC="img60.gif"><BR>
<STRONG>Figure:</STRONG> <A NAME="figendno">&#160;</A>
   Repair of end point mismatch by selecting a sub-sequence of length 1350 of
   the signal shown in Fig.&nbsp;<A HREF="node14.html#figend">6</A> that has an almost perfect match of 
   end points. The surrogate shows no spurious high frequency structure.<BR>
</blockquote><P>
<P>
The problem of non-matching ends can often be overcome by choosing a
sub-interval of the recording such that the end points do match as closely as
possible&nbsp;[<A HREF="node36.html#t_neuro">33</A>]. Any finite phase slip that remains at the
matching points is usually of lesser importance. It can become dominant,
though, if the signal is otherwise rather smooth.  As a systematic strategy,
we propose to measure the end point mismatch by
<BR><A NAME="eqmismatch">&#160;</A><IMG WIDTH=500 HEIGHT=43 ALIGN=BOTTOM ALT="equation1052" SRC="img61.gif"><BR>
and the mismatch in the first derivative by
<BR><A NAME="eqslip">&#160;</A><IMG WIDTH=500 HEIGHT=43 ALIGN=BOTTOM ALT="equation1054" SRC="img62.gif"><BR>
The fractions <IMG WIDTH=37 HEIGHT=15 ALIGN=MIDDLE ALT="tex2html_wrap_inline2068" SRC="img63.gif"> and <IMG WIDTH=27 HEIGHT=14 ALIGN=MIDDLE ALT="tex2html_wrap_inline2070" SRC="img64.gif">
give the contributions of the end point mismatch and of the mismatch in the
first derivative, respectively, to the total power of the series. For the series shown in
Fig.&nbsp;<A HREF="node14.html#figend">6</A>, <IMG WIDTH=99 HEIGHT=25 ALIGN=MIDDLE ALT="tex2html_wrap_inline2072" SRC="img65.gif"> and the end effect
dominates the high frequency end of the spectrum.  By systematically going
through shorter and shorter sub-sequences of the data, we find that a segment
of 1350 points starting at sample 102 yields <IMG WIDTH=103 HEIGHT=29 ALIGN=MIDDLE ALT="tex2html_wrap_inline2074" SRC="img66.gif">, that is, an almost perfect match.  That sequence is shown as the
upper trace of Fig.&nbsp;<A HREF="node14.html#figendno">7</A>, together with a surrogate (lower
trace). The spurious "crinkliness" is removed.
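<P>
The search over sub-sequences can be sketched in Python as follows. This is a sketch
under stated assumptions rather than the implementation used here: because the two
defining equations appear only as images on this page, the code assumes that the jump
measure is the squared difference between the first and last points, and the slip
measure the squared difference between the first and last increments, each divided by
the summed squared deviation from the mean. The function names, the equal weighting of
the two measures and the exhaustive search order are likewise hypothetical choices.

```python
import numpy as np

def end_point_mismatch(s):
    """Return (jump, slip) as fractions of the total power of s (assumed form)."""
    s = np.asarray(s, dtype=float)
    power = np.sum((s - s.mean()) ** 2)      # requires a non-constant series
    jump = (s[0] - s[-1]) ** 2 / power
    slip = ((s[1] - s[0]) - (s[-1] - s[-2])) ** 2 / power
    return jump, slip

def best_subsequence(s, min_length, weight_jump=1.0, weight_slip=1.0):
    """Try every window of at least min_length points; return (cost, start, length)."""
    s = np.asarray(s, dtype=float)
    candidates = []
    for length in range(len(s), min_length - 1, -1):
        for start in range(len(s) - length + 1):
            jump, slip = end_point_mismatch(s[start:start + length])
            cost = weight_jump * jump + weight_slip * slip
            candidates.append((cost, start, length))
    return min(candidates)                   # smallest weighted mismatch wins

rng = np.random.default_rng(2)
series = np.cumsum(rng.standard_normal(400))  # strongly correlated test signal
cost, start, length = best_subsequence(series, min_length=300)
jump, slip = end_point_mismatch(series[start:start + length])
```

The window <CODE>series[start:start + length]</CODE> would then be passed to the
surrogate generator instead of the full recording. The lower bound
<CODE>min_length</CODE> limits how much data is discarded for the sake of a good match.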
<P>
In practical situations, the matching of end points is a simple and mostly
sufficient precaution that should not be neglected. Let us mention that the SOI
data discussed before is rather well behaved with little end-to-end mismatch
(<IMG WIDTH=107 HEIGHT=25 ALIGN=MIDDLE ALT="tex2html_wrap_inline2076" SRC="img67.gif">).  Therefore we did not have to worry
about the periodicity artefact.
<P>
The only method proposed so far that strictly implements
<IMG WIDTH=31 HEIGHT=24 ALIGN=MIDDLE ALT="tex2html_wrap_inline2060" SRC="img57.gif"> rather than <IMG WIDTH=37 HEIGHT=24 ALIGN=MIDDLE ALT="tex2html_wrap_inline2080" SRC="img68.gif"> is given in Ref.&nbsp;[<A HREF="node36.html#anneal">26</A>] and will be
discussed in detail in Sec.&nbsp;<A HREF="node16.html#secanneal">5</A> below. The method is very accurate
but also rather costly in terms of computer time.  It should be used in cases
of doubt and whenever a suitable sub-sequence cannot be found.
<P>
<P><ADDRESS>
<I>Thomas Schreiber <BR>
Mon Aug 30 17:31:48 CEST 1999</I>
</ADDRESS>
</BODY>
</HTML>
